Fix the wrong Content-Length in python-server.py for non-ascii characters. #24480

Merged

@tomoki tomoki commented Nov 22, 2024

Resolves: #24479

python-server.py currently reads its input with sys.stdin.read, whose size argument counts characters of the decoded str (UTF-8 text), not bytes.
ref: https://docs.python.org/3/library/sys.html

On the other hand, "Content-Length" is the size in bytes, so we should not pass content_length to sys.stdin.read. For example, the length of print("こんにちは世界") is 16 as a str, but 30 as UTF-8 bytes.

>>> len('print("こんにちは世界")')
16
>>> len('print("こんにちは世界")'.encode())
30
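
To illustrate the mismatch on a stream, here is a minimal sketch using in-memory streams rather than the real stdin. The trailing `NEXT-MESSAGE` bytes are a made-up stand-in for whatever follows the payload. A text-mode read that is given the byte count asks for 30 characters and runs past the 16-character payload; on a live pipe with nothing further to read, the same call would presumably wait for characters that never arrive, which is consistent with the hang reported in #24479.

```python
import io

payload = 'print("こんにちは世界")'.encode("utf-8")  # 30 bytes, 16 characters
following = b"NEXT-MESSAGE"  # hypothetical data that follows the payload

# Text-mode read: the size argument counts characters, so passing the
# byte count (30) reads past the 16-character payload.
text_stream = io.TextIOWrapper(io.BytesIO(payload + following), encoding="utf-8")
over_read = text_stream.read(len(payload))

# Binary read: the size argument counts bytes, matching Content-Length.
byte_stream = io.BytesIO(payload + following)
exact = byte_stream.read(len(payload)).decode("utf-8")

print(repr(over_read))
print(repr(exact))
```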

This PR makes two changes.

  1. Replace sys.stdin.read(content_length) with sys.stdin.buffer.read(content_length).decode().
  2. Make _send_message calculate "Content-Length" from bytes, not str.
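
The two changes can be sketched roughly as follows. This is not the code from python-server.py: `send_message` and `read_message` are hypothetical helpers, the `Content-Length: N\r\n\r\n` header framing is an assumption, and `io.BytesIO` stands in for `sys.stdin.buffer` / `sys.stdout.buffer`.

```python
import io


def send_message(out, text: str) -> None:
    """Write one framed message; Content-Length counts the encoded bytes."""
    body = text.encode("utf-8")
    out.write(f"Content-Length: {len(body)}\r\n\r\n".encode("ascii"))
    out.write(body)
    out.flush()


def read_message(inp) -> str:
    """Read one framed message from a binary stream (assumed framing)."""
    content_length = 0
    while True:
        line = inp.readline()
        if not line:
            raise EOFError("stream closed before the headers ended")
        if line in (b"\r\n", b"\n"):
            break  # blank line ends the header section
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    # Read exactly content_length bytes, then decode to str.
    return inp.read(content_length).decode("utf-8")


# Round trip through an in-memory pipe.
pipe = io.BytesIO()
send_message(pipe, 'print("こんにちは世界")')
send_message(pipe, "x = 1")
pipe.seek(0)
first = read_message(pipe)
second = read_message(pipe)
print(first, second, sep="\n")
```

Because both the header value and the read size are in bytes, the second message starts exactly where the first one ends, even when the first contains multi-byte characters.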

With these changes, the original issue #24479 is resolved.

@vs-code-engineering vs-code-engineering bot added this to the November 2024 milestone Nov 25, 2024
@karthiknadig
Member

Thanks for identifying the issue and providing a fix for it 🚀. We really appreciate it. Happy Coding!!!

@karthiknadig karthiknadig enabled auto-merge (squash) November 25, 2024 05:24
…ters.

Content-Length is the size of the data in bytes, not the len of the str. We should use sys.stdin.buffer.read instead of sys.stdin.read to receive bytes.

_send_message should calculate "Content-Length" from bytes, not str.
We should use stdin.buffer.readline instead of stdin.readline because stdin.read* and stdin.buffer.read* should not be used at the same time. (stdin.read* reads through its own internal buffer.)
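
A small sketch of why the two layers should not be mixed, again with in-memory streams instead of the real stdin (the message content is invented for illustration). In CPython, a text-layer `readline` pulls a whole chunk from the underlying binary stream into its own buffer, so a later read on the binary layer can find the body already consumed.

```python
import io

data = b"Content-Length: 5\r\n\r\nhello"

# Mixed access: header via the text layer, body via the binary layer.
raw = io.BytesIO(data)
text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
header = text.readline()
leftover = raw.read()  # the text layer has already read ahead past the body

# Consistent access: everything through the binary layer.
raw2 = io.BytesIO(data)
header2 = raw2.readline()
blank = raw2.readline()
body = raw2.read(5)

print(repr(header), repr(leftover))
print(repr(header2), repr(blank), repr(body))
```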
@karthiknadig karthiknadig force-pushed the fix-content-length-for-python-repl-server branch from cd9a0cc to 1c8e5f4 Compare November 26, 2024 06:17
@tomoki
Author

tomoki commented Nov 26, 2024

Thank you for your review!

It looks like the GitHub Action failed because the rate limit was exceeded.
https://github.com/microsoft/vscode-python/actions/runs/12024958777/job/33521533822?pr=24480

@karthiknadig karthiknadig merged commit dba0a4c into microsoft:main Nov 26, 2024
46 checks passed
Labels
bug Issue identified by VS Code Team member as probable bug
Development

Successfully merging this pull request may close these issues.

"Run Selection/Line in Python REPL" hangs up if the code contains non-ascii characters
4 participants